Evaluating generative patent language models
Authors
Abstract
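Generative language models are promising for assisting human writing in various domains. This manuscript aims to build generative language models in the patent domain and to evaluate model performance from a human-centric perspective. The perspective is to measure the ratio of keystrokes that can be saved by autocompletion based on the models. A higher ratio means a more effective model, one which can save more keystrokes. The metric can be used to benchmark model performance. It is different from conventional machine-centric metrics, which are token-based instead of keystroke-based. In terms of model size, the largest model built in this manuscript is 6B, which is state-of-the-art in the patent domain. Based on the metric, it is found that the largest model is not necessarily the best for the human-centric metric. The finding means that continuing to increase model sizes in the patent domain might be unnecessary if the purpose is to assist human writing with autocompletion. Several patent language models are pre-trained from scratch in this research. The pre-trained models are released for future researchers. Several visualization tools are also provided. The importance of building generative language models in the patent domain is their potential to facilitate creativity and innovations in the future.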
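The keystroke-based metric described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definition, so everything here is an assumption made for illustration: the function names `keystrokes_saved_ratio` and `toy_suggest` are hypothetical, accepting a suggestion is assumed to cost exactly one keystroke, and the toy suggester stands in for a real patent language model.

```python
from typing import Callable, List


def keystrokes_saved_ratio(text: str, suggest: Callable[[str], List[str]]) -> float:
    """Simulate typing `text` with autocompletion; return the saved-keystroke ratio.

    `suggest(prefix)` is a hypothetical stand-in for a model: given the text
    typed so far, it returns candidate continuations.
    Assumption: accepting a suggestion costs exactly one keystroke.
    """
    if not text:
        return 0.0
    typed = 0  # keystrokes actually pressed
    pos = 0    # characters of `text` produced so far
    while pos < len(text):
        remaining = text[pos:]
        # Keep only suggestions that match the upcoming text and save at least one key.
        matches = [s for s in suggest(text[:pos])
                   if len(s) > 1 and remaining.startswith(s)]
        if matches:
            pos += len(max(matches, key=len))  # accept the longest matching suggestion
        else:
            pos += 1                           # type one character manually
        typed += 1
    return 1.0 - typed / len(text)


def toy_suggest(prefix: str) -> List[str]:
    # Hypothetical toy "model": proposes one fixed word after a trigger phrase.
    if prefix.endswith("a method "):
        return ["comprising"]
    return []


ratio = keystrokes_saved_ratio("a method comprising steps", toy_suggest)
no_help = keystrokes_saved_ratio("a method comprising steps", lambda prefix: [])
```

Under these assumptions a suggester that never matches yields a ratio of 0, and the ratio grows as longer stretches of text are accepted with a single keystroke. Because the count is taken over characters rather than tokens, two models with similar token-level scores can differ on this measure.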
Similar resources
Evaluating Generative Models for Text Generation
Generating human quality text is a challenging problem because of ambiguity of meaning and difficulty in modeling long term semantic connections. Recurrent Neural Networks (RNNs) have shown promising results in this problem domain, with the most common approach to its training being to maximize the log predictive likelihood of each true token in the training sequence given the previously observ...
Evaluating Implicit Generative Models with Large Samples
We study the problem of evaluating a generative model using only a finite sample from the model. For many common evaluation functions, generalization is meaningless because trivially memorizing the training set attains a better score than the models we consider state-of-the-art. We clarify a necessary condition for an evaluation function not to behave this way: estimating the function must requ...
Church: a language for generative models
Formal languages for probabilistic modeling enable re-use, modularity, and descriptive clarity, and can foster generic inference techniques. We introduce Church, a universal language for describing stochastic generative processes. Church is based on the Lisp model of lambda calculus, containing a pure Lisp as its deterministic subset. The semantics of Church is defined in terms of evaluation hi...
Generative Knowledge Transfer for Neural Language Models
In this paper, we propose a generative knowledge transfer technique that trains an RNN based language model (student network) using text and output probabilities generated from a previously trained RNN (teacher network). The text generation can be conducted by either the teacher or the student network. We can also improve the performance by taking the ensemble of soft labels obtained from multi...
Refining Generative Language Models using Discriminative Learning
We propose a new approach to language modeling which utilizes discriminative learning methods. Our approach is an iterative one: starting with an initial language model, in each iteration we generate 'false' sentences from the current model, and then train a classifier to discriminate between them and sentences from the training corpus. To the extent that this succeeds, the classifier is incorp...
Journal
Journal title: World Patent Information
Year: 2023
ISSN: 0172-2190, 1874-690X
DOI: https://doi.org/10.1016/j.wpi.2023.102173